Predictive Off-Policy Policy Evaluation for Nonstationary Decision Problems, with Applications to Digital Marketing
Authors
Abstract
In this paper we consider the problem of evaluating one digital marketing policy (or, more generally, a policy for an MDP with unknown transition and reward functions) using data collected from the execution of a different policy. We call this problem off-policy policy evaluation. Existing methods for off-policy policy evaluation assume that the transition and reward functions of the MDP are stationary, an assumption that is typically false, particularly for digital marketing applications. As a result, existing off-policy policy evaluation methods are reactive to nonstationarity: they slowly correct for changes only after those changes occur. We argue that off-policy policy evaluation for nonstationary MDPs can be phrased as a time series prediction problem, which yields predictive methods that can anticipate changes before they happen. We therefore propose a synthesis of existing off-policy policy evaluation methods with existing time series prediction methods, and we show that it drastically reduces mean squared error when evaluating policies on a real digital marketing data set.
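The idea of phrasing off-policy evaluation as time series prediction can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes ordinary importance sampling as the off-policy estimator and a least-squares linear trend as the time series model, both chosen here only for brevity. The function names and argument layouts are hypothetical.

```python
import numpy as np


def importance_sampling_estimates(returns, eval_probs, behavior_probs):
    """Per-episode ordinary importance-sampling estimates (assumed estimator).

    returns:        shape (n,), observed return of each episode, in time order
    eval_probs:     shape (n, horizon), probability of each logged action
                    under the evaluation policy
    behavior_probs: shape (n, horizon), probability of each logged action
                    under the behavior policy that collected the data
    """
    ratios = np.asarray(eval_probs, dtype=float) / np.asarray(behavior_probs, dtype=float)
    weights = np.prod(ratios, axis=1)  # one importance weight per episode
    return weights * np.asarray(returns, dtype=float)


def stationary_estimate(estimates):
    """Reactive baseline: average all past estimates, ignoring their order."""
    return float(np.mean(estimates))


def predictive_estimate(estimates, steps_ahead=1):
    """Fit a linear trend to the time-ordered estimates and extrapolate it.

    The linear trend is an illustrative assumption; any time series
    forecaster could be substituted here.
    """
    y = np.asarray(estimates, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)
    return float(intercept + slope * (len(y) - 1 + steps_ahead))
```

When the per-episode estimates drift over time, `stationary_estimate` lags behind the drift, whereas `predictive_estimate` extrapolates it forward. With noisy importance-sampling estimates the extrapolation can have high variance, so the choice of forecaster matters in practice.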
Similar resources
A new last aggregation compromise solution approach based on TOPSIS method with hesitant fuzzy setting to energy policy evaluation
Utilizing renewable energies is identified as one of significant issues for economical and social significance in future human life. Thus, choosing the best renewable energy among renewable energy candidates is more important. To address the issue, multi-criteria group decision making (MCGDM) methods with imprecise information could be employed to solve these problems. The aim of this paper is ...
Digital Direct-to-Consumer Advertising: A Perfect Storm of Rapid Evolution and Stagnant Regulation; Comment on “Trouble Spots in Online Direct-to-Consumer Prescription Drug Promotion: A Content Analysis of FDA Warning Letters”
The adoption and use of digital forms of direct-to-consumer advertising (also known as “eDTCA”) is on the rise. At the same time, the universe of eDTCA is expanding, as technology on Internet-based platforms continues to evolve, from static websites, to social media, and nearly ubiquitous use of mobile devices. However, little is known about how this unique form of pharmaceutical marketing impa...
Delving Into the Details of Evaluating Public Engagement Initiatives; Comment on “Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review”
Initiatives to engage the public in health policy decisions have been widely endorsed and used, yet agreed upon methods for systematically evaluating the effectiveness of these initiatives remain to be developed. Dukhanin, Topazian, and DeCamp have thus developed a useful taxonomy of evaluation criteria derived from a systematic review of published evaluation tools that might serve as the basis...
Policy Evaluation with Temporal Differences: A Survey and Comparison
Value functions are an essential tool for solving sequential decision making problems such as Markov decision processes (MDPs). Computing the value function for a given policy (policy evaluation) is not only important for determining the quality of the policy but also a key step in prominent policy-iteration-type algorithms. In common settings where a model of the Markov decision process is not...
Evidence for Informing Health Policy Development in Low-Income Countries (LICs): Perspectives of Policy Actors in Uganda
Background Although there is a general agreement on the benefits of evidence informed health policy development given resource constraints especially in Low-Income Countries (LICs), the definition of what evidence is, and what evidence is suitable to guide decision-making is still unclear. Our study is contributing to filling this knowledge gap. We aimed to explore health policy actors’ views r...